Abstract. Let $Q_n$ denote a random symmetric $n$ by $n$ matrix, whose upper
diagonal entries are i.i.d. Bernoulli random variables (which take values 0 and
1 with probability 1/2). We prove that $Q_n$ is non-singular with probability
$1 - O(n^{-1/8+\delta})$ for any fixed $\delta > 0$. The proof uses a quadratic version
of Littlewood-Offord type results concerning the concentration functions of 
random variables and can be extended for more general models of random 
matrices. 



1. Introduction 

Let $A_n$ denote a random $n$ by $n$ matrix, whose entries are i.i.d. Bernoulli random
variables, which take values 0 and 1 with probability 1/2. A basic question is the
following.

Question 1.1. Is it true that $A_n$ is almost surely non-singular?

Here and later we say that an event holds almost surely if it holds with probability 
tending to one as n tends to infinity. 

The above question was answered affirmatively by Komlós in 1967 [5]. Later,
Komlós generalized the result (to more general models of random matrices) [6]
and also simplified the proof [1]. In a recent paper [7], Tao and Vu found a different
proof which leads to a sharp estimate on the absolute value of the determinant of
$A_n$.

Another popular model of random matrices is that of random symmetric matrices; 
this is one of the simplest models that has non-trivial correlations between matrix 
entries. Let $Q_n$ denote a random symmetric $n$ by $n$ matrix, whose upper diagonal
entries (indexed by $1 \le i \le j \le n$) are i.i.d. Bernoulli random variables. It is natural to
ask:

Question 1.2. Is it true that $Q_n$ is almost surely non-singular?



As far as we can trace, this question was first posed by Weiss in the early nineties. 
Despite its obvious similarity to Question 1.1, we do not know of any partial results 
concerning this question, prior to this paper. A significant new difficulty is that
the symmetry ensures that the determinant $\det(Q_n)$ is a quadratic function of each
row, as opposed to $\det(A_n)$, which is a linear function of each row.

The goal of the current paper is to give an affirmative answer to Question 1.2. 

Theorem 1.3. $Q_n$ is almost surely non-singular. More precisely,

$$p_n := \mathbf{P}(Q_n \text{ is singular}) = O(n^{-1/8+\delta})$$

for any positive constant $\delta$ (the implicit constant in the $O()$ notation of course is
allowed to depend on $\delta$).

Remark 1.4. The exponent $-1/8 + \delta$ can be improved somewhat by tightening
the calculation and applying more technical arguments. However, to improve the 
bound to an exponential bound (in the spirit of [4]) seems to require new ideas; see 
Section 7. 
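
For very small $n$ the singularity probability $p_n$ of Theorem 1.3 can be computed exactly by enumerating all symmetric 0/1 matrices. The following sketch is only an illustration and plays no role in the proof; the helper names (`det_bareiss`, `symmetric_matrices`, `singular_fraction`) are ours, and the enumeration is only feasible for $n$ up to about 5, since there are $2^{n(n+1)/2}$ matrices.

```python
from itertools import product


def det_bareiss(m):
    """Exact integer determinant via fraction-free (Bareiss) elimination."""
    a = [row[:] for row in m]
    n = len(a)
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            # Look for a row below with a non-zero entry in column k.
            for r in range(k + 1, n):
                if a[r][k] != 0:
                    a[k], a[r] = a[r], a[k]
                    sign = -sign
                    break
            else:
                return 0  # whole column is zero below the diagonal
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


def symmetric_matrices(n):
    """Yield every symmetric n x n matrix with entries in {0, 1}."""
    idx = [(i, j) for i in range(n) for j in range(i, n)]
    for bits in product((0, 1), repeat=len(idx)):
        m = [[0] * n for _ in range(n)]
        for (i, j), b in zip(idx, bits):
            m[i][j] = m[j][i] = b
        yield m


def singular_fraction(n):
    """Return (number of singular matrices, total number of matrices)."""
    total = singular = 0
    for m in symmetric_matrices(n):
        total += 1
        singular += det_bareiss(m) == 0
    return singular, total


if __name__ == "__main__":
    for n in range(1, 5):
        s, t = singular_fraction(n)
        print(n, s, t, s / t)
```

Such small cases say nothing about the asymptotic rate; they only make the quantity $p_n$ concrete.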

The rest of the paper is organized as follows. In the next section, we present 
our approach and the key lemmas. The lemmas will be discussed in Sections 3-5. 
Section 6 is devoted to the generalization of the result to other models of random 
matrices. We conclude by Section 7 which contains several open questions. 

Notation. Throughout the paper, we assume that $n$ is large, whenever needed. The
asymptotic notations are used under the assumption that $n \to \infty$. $\mathbf{E}$ and $\mathrm{Var}$
denote expectation and variance, respectively; log denotes the natural
logarithm.

2. The approach and main lemmas 

As mentioned above, there are now three different proofs of Komlós' 1967 result on
the non-singularity of $A_n$. The simpler ones are [1] and [7]. But the original (and
longest) proof from [5] is what really inspires us. The key difference between these
proofs lies in the way one generates $A_n$. In the proofs from [1] and [7] one builds
up $A_n$ by exposing the row vectors one by one and making use of the independence
of these vectors. This approach, unfortunately, is no longer effective for $Q_n$, as the
last few rows are almost deterministic once one has exposed all rows above them. In
[5], one builds up $A_n$ by taking $A_{n-1}$ and adding a (random) row and a (random)
column. This idea turns out to be useful for the consideration of $Q_n$. However, for
$Q_n$ the additional row and column are not independent. They are transposes of
each other, and this is the main obstacle. We have managed to overcome
this obstacle by developing a quadratic variant of Littlewood-Offord type results
concerning the concentration of random variables (see Section 4).

The basic strategy is to relate the rank of $Q_n$ with the rank of $Q_{n+1}$. Assume that
we get $Q_{n+1}$ by adding a new column and its transpose as a new row to $Q_n$. Our
starting point is the following simple observation:

$$\mathrm{rank}(Q_n) \le \mathrm{rank}(Q_{n+1}) \le \mathrm{rank}(Q_n) + 2. \quad (1)$$

We shall refine this by showing that if $Q_n$ is singular (so $\mathrm{rank}(Q_n) < n$), then
$\mathrm{rank}(Q_{n+1})$ will equal $\mathrm{rank}(Q_n) + 2$ with high probability; similarly, if $Q_n$ is
non-singular (so $\mathrm{rank}(Q_n) = n$), then $\mathrm{rank}(Q_{n+1})$ will equal $\mathrm{rank}(Q_n) + 1 = n + 1$ with
high probability. These two results together will then be easily combined with an
inductive argument to show that $\mathrm{rank}(Q_n) = n$ with high probability.
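
The rank inequality (1) and the growth process behind it can be observed numerically. The sketch below (our own illustration; the names `rank`, `augment` and `rank_jumps` are hypothetical helpers, not notation from the paper) grows a symmetric 0/1 matrix one bordered row and column at a time and records the rank increments, computing ranks exactly over the rationals.

```python
import random
from fractions import Fraction


def rank(m):
    """Exact rank over the rationals by Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in m]
    rows = len(a)
    cols = len(a[0]) if a else 0
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, rows):
            if a[i][c] != 0:
                f = a[i][c] / a[r][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
        if r == rows:
            break
    return r


def augment(q, rng):
    """Border the symmetric matrix q with a random 0/1 vector and its transpose."""
    n = len(q)
    u = [rng.randint(0, 1) for _ in range(n + 1)]
    new = [row[:] + [u[i]] for i, row in enumerate(q)]
    new.append(u[:])  # last row is the transpose of the new column
    return new


def rank_jumps(n_max, seed=0):
    """Rank increments rank(Q_{n+1}) - rank(Q_n) for n = 1, ..., n_max - 1."""
    rng = random.Random(seed)
    q = [[rng.randint(0, 1)]]
    jumps = []
    while len(q) < n_max:
        bigger = augment(q, rng)
        jumps.append(rank(bigger) - rank(q))
        q = bigger
    return jumps


if __name__ == "__main__":
    print(rank_jumps(15))
```

By (1) every recorded increment lies in $\{0, 1, 2\}$; the lemmas below quantify how often each value occurs.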

We now turn to the details. Let us fix a small positive constant $\varepsilon > 0$. We allow the
implicit constants to depend on $\varepsilon$, and we will assume that $n$ is sufficiently large
depending on $\varepsilon$. Set

$$N := n^{1-\varepsilon}. \quad (2)$$

Definition 2.1. Given $m$ vectors $\{v_1, v_2, \dots, v_m\}$, a linear combination of the $v_i$ is
a vector $v = c_1 v_1 + \dots + c_m v_m$, where the $c_i$ are real numbers. We say that a linear
combination vanishes if $v$ is the zero vector. A vanishing linear combination has
degree $k$ if exactly $k$ among the $c_i$ are non-zero. We call a singular $n$ by $n$ matrix
normal if its row vectors do not admit a non-trivial vanishing linear combination 
with degree less than N. Otherwise we call the matrix abnormal. 

Remark 2.2. We use the terms normal and abnormal only when the matrix in 
question is singular. These terms are not defined (and we don't need them) for 
non-singular matrices. 

In Section 3 we shall prove that most singular matrices are normal: 

Lemma 2.3. The probability that $Q_n$ is singular and abnormal is $O((2/3)^n)$.

In Section 5 we shall prove 

Lemma 2.4. Let $A$ be a (deterministic) $n$ by $n$ singular normal matrix, and let $A'$
be the $n+1$ by $n+1$ matrix formed by augmenting $A$ by a random vector of length
$n+1$ and its transpose. Then

$$\mathbf{P}(\mathrm{rank}(A') - \mathrm{rank}(A) < 2) = O(N^{-1/2})$$

and thus

$$\mathbf{P}(\mathrm{rank}(A') - \mathrm{rank}(A) = 2) = 1 - O(N^{-1/2}).$$

Intuitively, these two lemmas state that in most cases, augmenting a singular matrix 
by a random vector and its transpose will increase the rank by exactly 2. Note that 
by Bayes' identity, Lemma 2.4 automatically generalizes to matrices A which are 
random instead of deterministic, as long as the random vector which is augmenting 
A is independent of A. 

We now develop analogues of the above two lemmas for non-singular matrices. 

Definition 2.5. A row of an $n$ by $n$ non-singular matrix is called good if its
exclusion leads to an $(n-1) \times n$ matrix whose column vectors admit a nontrivial
vanishing linear combination with degree at least $N$. (In fact, there is exactly one
such combination, up to scaling, as the rank of this $(n-1) \times n$ matrix is $n-1$.) A
row is bad otherwise. We say that an $n \times n$ non-singular matrix $A$ is perfect if every
row in $A$ is a good row. If a non-singular matrix is not perfect, we call it imperfect.

Remark 2.6. We use the terms perfect and imperfect only when the matrix in 
question is non-singular. These terms are not defined for singular matrices. 

In Section 3 we shall prove that most non-singular matrices are perfect: 

Lemma 2.7. The probability that $Q_n$ is both non-singular and imperfect is $O((2/3)^n)$.

In Section 5 we shall prove the following analogue of Lemma 2.4: 

Lemma 2.8. Let $A$ be a (deterministic) non-singular perfect symmetric matrix of
size $n$, and let $A'$ be the $(n+1) \times (n+1)$ matrix formed by augmenting $A$ by a
random $(n+1)$-vector of 0s and 1s, and its transpose. Then

$$\mathbf{P}(\mathrm{rank}(A') = \mathrm{rank}(A)) = O(N^{-1/8})$$

for any positive constant $\delta$, where the implicit constant can of course depend on $\delta$.
In particular, since

$$n = \mathrm{rank}(A) \le \mathrm{rank}(A') \le n + 1$$

we see that

$$\mathbf{P}(\mathrm{rank}(A') - \mathrm{rank}(A) = 1) = 1 - O(N^{-1/8}).$$

The last two lemmas are the non-singular counterparts of the first two. Together, 
they state that if a matrix already has full rank, augmenting it will typically produce 
another matrix of full rank. Again, we can automatically generalize Lemma 2.8 to 
the case when A is random and independent of the augmenting row. 

Let us assume these lemmas for the moment and conclude the proof of Theorem 
1.3. 

Consider a random matrix $Q_n$. We embed it into a sequence $\{Q_1, Q_2, \dots\}$ of random
matrices, where $Q_{n+1}$ is formed from $Q_n$ by adding a random vector of 0s and 1s
(independent of $Q_n$) of length $n+1$ as the last column, and its transpose as the
last row.

Define the (somewhat artificial) random variable $X_n$ by setting $X_n = 0$ if $Q_n$ is
non-singular (thus $\mathrm{rank}(Q_n) = n$), and $X_n = (1.1)^{n - \mathrm{rank}(Q_n)}$ otherwise. Thus
$X_n$ ranges between 0 and $(1.1)^n$. We have the following decay estimate for the
expectation $\mathbf{E}(X_n)$ of $X_n$.

Lemma 2.9. $\mathbf{E}(X_{n+1}) \le 0.99\,\mathbf{E}(X_n) + O(N^{-1/8})$.

Proof. For any $0 \le j \le n$, let $A_j$ be the event that $Q_n$ has rank $n - j$, and that
$Q_n$ is neither abnormal (if $j > 0$) nor imperfect (if $j = 0$). By Bayes' identity and
Lemmas 2.3, 2.7, we have

$$\mathbf{E}(X_n) = \sum_{j=1}^{n} (1.1)^j \, \mathbf{P}(A_j) + O((1.1)^n (2/3)^n)$$

and

$$\mathbf{E}(X_{n+1}) = \sum_{j=0}^{n} \mathbf{E}(X_{n+1} \mid A_j) \, \mathbf{P}(A_j) + O((1.1)^n (2/3)^n).$$

Now let us condition on the event $A_0$, thus $Q_n$ is non-singular. From Lemma 2.8
we see that $Q_{n+1}$ has rank $n$ with probability $O(N^{-1/8})$, and rank $n+1$ otherwise.
Thus

$$\mathbf{E}(X_{n+1} \mid A_0) = O(N^{-1/8}).$$

Now let $1 \le j \le n$ and condition on the event $A_j$, thus $Q_n$ is singular with rank
$n - j$. From Lemma 2.3 and Lemma 2.4 we see that $Q_{n+1}$ has rank $n - j + 2$ with
probability $1 - O(N^{-1/2})$, and has rank $n - j$ or $n - j + 1$ otherwise. Thus

$$\mathbf{E}(X_{n+1} \mid A_j) \le \mathbf{E}\big((1.1)^{n+1-\mathrm{rank}(Q_{n+1})} \mid A_j\big)$$

$$\le (1.1)^{j-1} + O(N^{-1/2}) (1.1)^{j+1}$$

$$\le 0.99\,(1.1)^j$$

if $N = n^{1-\varepsilon}$ is large enough. Putting all these estimates together, and noting that

$$O((1.1)^n (2/3)^n) = O(N^{-1/8}),$$

we obtain the claim. ■

From the above lemma and an easy induction, we see that

$$\mathbf{E}(X_n) = O(N^{-1/8})$$

for all large $n$. From Markov's inequality we then see that

$$\mathbf{P}(Q_n \text{ singular}) = \mathbf{P}(X_n \ge 1) = O(N^{-1/8}).$$

Theorem 1.3 then follows from the definition of $N$.
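
The mechanism behind the induction can be seen on a model recurrence. The sketch below assumes, as a simplification that is ours and not the paper's, that the additive error is a fixed constant $b$ (in the actual argument it is $O(N^{-1/8})$ and varies slowly with $n$). For $e_{k+1} = 0.99\,e_k + b$ one gets $e_k = 0.99^k e_0 + 100\,b\,(1 - 0.99^k)$, so the initial value is forgotten geometrically and $e_k$ settles at size at most $100\,b$. The function names are hypothetical.

```python
from fractions import Fraction

RATE = Fraction(99, 100)  # the contraction factor 0.99 from Lemma 2.9


def iterate(e0, b, steps, rate=RATE):
    """Run e <- rate * e + b for the given number of steps (exact arithmetic)."""
    e = Fraction(e0)
    for _ in range(steps):
        e = rate * e + b
    return e


def closed_form(e0, b, steps, rate=RATE):
    """Closed form rate^k * e0 + b * (1 - rate^k) / (1 - rate) of the same recurrence."""
    return rate ** steps * e0 + b * (1 - rate ** steps) / (1 - rate)


if __name__ == "__main__":
    b = Fraction(1, 1000)
    for k in (0, 10, 100, 1000):
        print(k, float(iterate(1, b, k)), float(100 * b))
```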

It remains to prove Lemmas 2.3-2.8. This will be done in the next few sections.
Among these lemmas, the first three are variants of lemmas from [5] and are
relatively simple. The proof of Lemma 2.8 is somewhat more complicated and requires
new machinery, discussed in Section 4.



3. Proof of Lemmas 2.3 and 2.7 

The two proofs are similar and rely on the following simple observation from [5] 
(which has also been used in [4], [7], [8]): 

Lemma 3.1. Let $H$ be a linear subspace in $\mathbb{R}^n$ of dimension at most $d \le n$. Then
it contains at most $2^d$ vectors from $\{0,1\}^n$.

Proof. The space $H$ is spanned by the row vectors of a $d' \times n$ full-ranked matrix,
where $d' := \dim(H) \le d$. This matrix has at least one non-singular $d' \times d'$ minor,
thus there exists a set of $d'$ co-ordinates of $\mathbb{R}^n$ which can be used to parameterize
$H$. But in $\{0,1\}^n$, these co-ordinates take only $2^{d'} \le 2^d$ values, and the claim
follows. ■
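
Lemma 3.1 is easy to check by brute force in low dimension. The following sketch (an illustration under our own naming; `rank` and `count_binary_vectors_in_span` are hypothetical helpers) counts the vectors of $\{0,1\}^n$ that lie in the row span of a given integer matrix, using the fact that $v$ lies in the span exactly when appending $v$ as a row does not raise the rank.

```python
from fractions import Fraction
from itertools import product


def rank(m):
    """Exact rank over the rationals by Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in m]
    rows = len(a)
    cols = len(a[0]) if a else 0
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, rows):
            if a[i][c] != 0:
                f = a[i][c] / a[r][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
        if r == rows:
            break
    return r


def count_binary_vectors_in_span(basis):
    """Return (number of 0/1 vectors in the row span of basis, dimension of the span)."""
    n = len(basis[0])
    d = rank(basis)
    count = sum(
        1
        for v in product((0, 1), repeat=n)
        if rank(basis + [list(v)]) == d  # v adds nothing new to the span
    )
    return count, d


if __name__ == "__main__":
    print(count_binary_vectors_in_span([[1, 1, 0], [0, 1, 1]]))
    print(count_binary_vectors_in_span([[1, 0, 0], [0, 1, 0]]))
```

The second example, a coordinate subspace, attains the bound $2^d$; a generic subspace contains far fewer 0/1 vectors.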



Proof [of Lemma 2.3]. For any $1 \le d \le n$, let $g(n,d)$ be the probability that the
row vectors of $Q_n$ admit a nontrivial vanishing linear combination of degree $d$. For
$d = 1$ we have the easy bound $g(n,1) \le n 2^{-n}$, since $g(n,1)$ is simply the probability
that one of the rows of $Q_n$ is entirely zero. Now take $d \ge 2$. To bound $g(n,d)$ from
above, notice that by symmetry and the union bound we have the crude estimate

$$g(n,d) \le \binom{n}{d} h(n,d) \le n^d h(n,d)$$

where $h(n,d)$ is the probability that the first $d$ rows admit such a combination. This
means that if we fix the first $d-1$ row vectors, then the $d$-th row vector lies in the
subspace spanned by these vectors. The same claim is true if we delete the first
$d-1$ columns from $Q_n$ (we need to do this as $Q_n$ is symmetric). The remaining
entries in the $d$-th row vector are now distributed independently in $\{0,1\}$, and
so by Lemma 3.1 the probability of lying in the span of the first $d-1$ row vectors
is at most $2^{d-1}/2^{n-d-1}$. Thus the probability that $Q_n$ is singular and
abnormal is at most

$$\sum_{d=1}^{N} g(n,d) \le n 2^{-n} + \sum_{d=2}^{N} n^{d+1} 2^{d-1}/2^{n-d-1} = O((2/3)^n)$$

by the definition (2) of $N$. ■



Proof [of Lemma 2.7]. Let $b(n)$ be the probability that the last row of $Q_n$ is bad.
By symmetry and the union bound, the probability that $Q_n$ is non-singular and
imperfect is at most $n\,b(n)$. We can bound $b(n)$ using the same argument as in the
previous proof, with a slight modification; the column vectors have length n — 1 so 
we need to replace n by n — 1, but this does not affect the bound. We omit the 
details. ■ 



4. A quadratic Littlewood-Offord inequality



Let us start with the following classical result, proved by Erdős, which strengthens
an earlier result of Littlewood and Offord.

Theorem 4.1 (Linear Littlewood-Offord inequality). [2] Let $z_1, \dots, z_n$ be i.i.d.
random variables which take values 0 and 1 with probability 1/2. Let $a_1, \dots, a_n$ be
real deterministic coefficients, with $|a_i| \ge 1$ for at least $k \ge 1$ values of $i$. Then for
any interval $I \subset \mathbb{R}$ of length 1, we have

$$\mathbf{P}\Big(\sum_{i=1}^{n} a_i z_i \in I\Big) = O(k^{-1/2})$$

where the implied constant is absolute.

Roughly speaking, the theorem says that linear random sums cannot concentrate 
on small intervals if the coefficients of the underlying linear form are large. 
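
For integer coefficients the concentration in Theorem 4.1 can be computed exactly, since a half-open unit interval contains exactly one integer and the quantity of interest is then the largest point mass of the sum. The sketch below (our own illustration, with hypothetical function names) builds the distribution of $\sum_i a_i z_i$ by dynamic programming; for the all-ones coefficient vector the largest point mass is the central binomial probability $\binom{k}{\lfloor k/2 \rfloor} 2^{-k}$, which is of order $k^{-1/2}$.

```python
from collections import Counter
from fractions import Fraction
from math import comb, sqrt


def distribution(coeffs):
    """Counts of each value of sum(a_i * z_i) over all z in {0,1}^n (integer a_i)."""
    dist = Counter({0: 1})
    for a in coeffs:
        nxt = Counter()
        for s, c in dist.items():
            nxt[s] += c      # contribution of z_i = 0
            nxt[s + a] += c  # contribution of z_i = 1
        dist = nxt
    return dist


def max_point_mass(coeffs):
    """Largest probability that sum(a_i * z_i) equals a single value."""
    dist = distribution(coeffs)
    return Fraction(max(dist.values()), 2 ** len(coeffs))


if __name__ == "__main__":
    for k in (4, 16, 64, 256):
        p = max_point_mass([1] * k)
        print(k, float(p), float(p) * sqrt(k))  # last column stays bounded
```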

Remark 4.2. There are a number of far-reaching generalizations and interesting
refinements of Theorem 4.1 (see e.g. [3] and the references therein). We mention
some rather trivial ones here (which we will need later). Firstly, we can replace the
unit interval $I$ by any other interval of length $O(1)$ (at the cost of changing the
implied constant in $O((1+k)^{-1/2})$, of course), by covering such an interval by unit
intervals. Similarly, we may scale the constraint $|a_i| \ge 1$ and replace it by $|a_i| \ge c$
for some other $c > 0$, again at the cost of letting the implied constant depend
on $c$. Finally, one can replace the distribution of the $z_i$ with the distribution
$\mathbf{P}(z_i = 0) = \alpha$, $\mathbf{P}(z_i = 1) = \beta$, $\mathbf{P}(z_i = -1) = \gamma$, where $\alpha, \beta, \gamma$ are non-negative
constants summing up to one and $\alpha < 1$. The implied constant will then of course
depend on $\alpha, \beta, \gamma$.

To conclude the proof of Theorem 1.3, we need to generalize Theorem 4.1 in a 
direction different from what has been done before. Instead of considering a linear 
form, we are going to consider a quadratic form of the $z_i$. (In fact, our method works
for polynomials of any fixed degree, by iterating the argument below.) Consider
random variables $z_i$ as in Theorem 4.1 and define the quadratic random variable

$$Q := \sum_{1 \le i, j \le n} c_{ij} z_i z_j. \quad (3)$$

The main result of this section is the following quadratic generalization of Theorem 
4.1. 

Theorem 4.3 (Quadratic Littlewood-Offord inequality). Let the quadratic random
variable $Q$ be as in (3), let $\{1, \dots, n\} = U_1 \cup U_2$ be any non-trivial partition, and
let $S$ be any non-empty subset of $U_1$. For each $i \in S$, let $d_i$ be the number of indices
$j \in U_2$ such that $|c_{ij}| \ge 1$. Suppose that $d_i \ge 1$ for each $i \in S$. Then for any
interval $I$ of length 1, we have

$$\mathbf{P}(Q \in I) = O\Big(\Big(|S|^{-1/2} + |S|^{-1} \sum_{i \in S} d_i^{-1/2}\Big)^{1/4}\Big).$$

The implied constant is absolute. 

It is unlikely that the bound on the right-hand side is best possible, but for us, 
any bound which decays to zero when the number of large coefficients Cij goes to 
infinity will suffice. 

The proof of Theorem 4.3 is lengthy and will be given later. Assuming it for the 
moment, we have the following corollary: 

Corollary 4.4. Let $Q$ be as in (3), and suppose that there is a set $U \subset \{1, \dots, n\}$ of
cardinality $|U| \ge m \ge 2$ such that for each $i \in U$, there are at least $m$ indices $j \in \{1, \dots, n\}$
where $|c_{ij}| \ge 1$. Then for any interval $I$ of length 1

$$\mathbf{P}(Q \in I) = O(m^{-1/8}).$$
The implied constant is absolute. 



Proof. Without loss of generality we may take $m$ to be even. Let $U_1$ be an arbitrary
subset of $U$ of cardinality $m/2$ and write $U_2 := \{1, \dots, n\} \setminus U_1$; then for any $i \in U_1$
there exist at least $m/2$ indices $j \in U_2$ for which $|c_{ij}| \ge 1$. Applying Theorem 4.3
with $S := U_1$, we conclude

$$\mathbf{P}(Q \in I) = O\Big(\Big((m/2)^{-1/2} + (m/2)^{-1} \sum_{i \in U_1} (m/2)^{-1/2}\Big)^{1/4}\Big) = O(m^{-1/8})$$
as desired. ■ 



By rescaling the above corollary we obtain the following discrete version. 

Corollary 4.5. Let $Q$ be as in (3), and suppose that there are at least $m$ indices $i$
such that for each $i$ there are at least $m$ indices $j$ where $c_{ij} \ne 0$. Then

$$\mathbf{P}(Q = 0) = O(m^{-1/8})$$

where the implied constant is absolute. 



This Corollary will be the one we use to establish Lemma 2.8. 
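
For small $n$ the probability $\mathbf{P}(Q = 0)$ in Corollary 4.5 can be computed by exhaustive enumeration. The sketch below is an illustration only; the function name is ours, and we restrict the examples to symmetric coefficient matrices, which is the situation in the application to cofactors of a symmetric matrix. For the matrix with zero diagonal and ones elsewhere, $Q = s^2 - s$ with $s = \sum_i z_i$, so $Q = 0$ exactly when $s \in \{0, 1\}$.

```python
from fractions import Fraction
from itertools import product


def prob_quadratic_zero(c):
    """Exact P(Q = 0) for Q = sum_{i,j} c[i][j] z_i z_j with z uniform on {0,1}^n."""
    n = len(c)
    hits = 0
    for z in product((0, 1), repeat=n):
        q = sum(c[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
        hits += q == 0
    return Fraction(hits, 2 ** n)


def off_diagonal_ones(n):
    """Symmetric matrix with zero diagonal and ones elsewhere (example input)."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for n in range(2, 11):
        print(n, prob_quadratic_zero(off_diagonal_ones(n)))
```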



4.6. Proof of Theorem 4.3. We now prove Theorem 4.3. As a first attempt to
prove this theorem, one might try to view the quadratic form $Q$ as a linear form

$$Q = \sum_{i=1}^{n} Q_i z_i \quad (4)$$

where the coefficients $Q_i$ are themselves linear form random variables $Q_i := \sum_{j=1}^{n} c_{ij} z_j$.
Thus one might hope to obtain Theorem 4.3 from two applications of Theorem 4.1.
Unfortunately, there is a serious obstruction to this strategy, because the
coefficients $Q_1, \dots, Q_n$ are not independent of the variables $z_1, \dots, z_n$. However, we can
get around this obstacle by the following decoupling lemma, which relies on the
Cauchy-Schwarz inequality.

Lemma 4.7 (Decoupling lemma). Let $X$ and $Y$ be random variables and
$E = E(X, Y)$ be an event depending on $X$ and $Y$. Then

$$\mathbf{P}(E(X,Y)) \le \mathbf{P}\big(E(X,Y) \wedge E(X',Y) \wedge E(X,Y') \wedge E(X',Y')\big)^{1/4}$$

where $X'$ and $Y'$ are independent copies of $X$ and $Y$, respectively. Here we use
$A \wedge B$ to denote the event that $A$ and $B$ both hold.

Remark 4.8. This lemma is a probabilistic analogue of the well-known result in
extremal graph theory, that if a bipartite graph connecting $n$ and $m$ vertices contains
at least $cnm$ edges for some $0 < c \le 1$, then it also contains at least $c^4 n^2 m^2$ copies
of the four-cycle $C_4$, where we include degenerate four-cycles. Indeed, the two
results are easily shown to be equivalent. This decoupling lemma also plays the role
of the van der Corput lemma used in Weyl's estimation of exponential sums with 
quadratic (or more generally polynomial) phases; indeed it is quite likely that one 
could obtain an estimate very similar to Theorem 4.3 by means of these techniques 
(combined with Esseen's concentration inequality), however we have chosen a more 
elementary combinatorial approach here. 

Proof. Let us first consider the case when $X$ takes a finite number of values
$x_1, \dots, x_n$ and $Y$ takes a finite number of values $y_1, \dots, y_m$. From Bayes' identity
we have

$$\mathbf{P}(E(X,Y)) = \sum_{i=1}^{n} \mathbf{P}(E(x_i, Y)) \, \mathbf{P}(X = x_i)$$

and

$$\mathbf{P}(E(X,Y) \wedge E(X,Y')) = \sum_{i=1}^{n} \mathbf{P}(E(x_i, Y))^2 \, \mathbf{P}(X = x_i)$$

and hence by the Cauchy-Schwarz inequality

$$\mathbf{P}(E(X,Y)) \le \mathbf{P}(E(X,Y) \wedge E(X,Y'))^{1/2}.$$

Similarly, we have

$$\mathbf{P}(E(X,Y) \wedge E(X,Y')) = \sum_{j=1}^{m} \sum_{j'=1}^{m} \mathbf{P}(E(X,y_j) \wedge E(X,y_{j'})) \, \mathbf{P}(Y = y_j) \, \mathbf{P}(Y = y_{j'})$$

and

$$\mathbf{P}(E(X,Y) \wedge E(X,Y') \wedge E(X',Y) \wedge E(X',Y')) = \sum_{j=1}^{m} \sum_{j'=1}^{m} \mathbf{P}(E(X,y_j) \wedge E(X,y_{j'}))^2 \, \mathbf{P}(Y = y_j) \, \mathbf{P}(Y = y_{j'})$$

so by Cauchy-Schwarz again

$$\mathbf{P}(E(X,Y) \wedge E(X,Y')) \le \mathbf{P}(E(X,Y) \wedge E(X,Y') \wedge E(X',Y) \wedge E(X',Y'))^{1/2}.$$

Combining these two applications of Cauchy-Schwarz, we obtain the claim. The
general case when $X$ and $Y$ could take a countable or uncountable number
of values then follows, either by a discretization argument, or by replacing the
sums with integrals and using Fubini's theorem; we omit the details, since for our
application we only need the case when $X$, $Y$ take finitely many values. ■
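
In the finite case the decoupling inequality can be verified directly. The sketch below assumes, for illustration only, that $X$ and $Y$ are independent and uniform on finite sets and that the event is given by an arbitrary 0/1 table; the function name `decoupling_sides` is ours. It computes both sides exactly and checks $\mathbf{P}(E(X,Y))^4 \le \mathbf{P}(E(X,Y) \wedge E(X',Y) \wedge E(X,Y') \wedge E(X',Y'))$ on random tables.

```python
import random
from fractions import Fraction


def decoupling_sides(event, xs, ys):
    """Return (P(E(X,Y)), P(E(X,Y) and E(X',Y) and E(X,Y') and E(X',Y')))
    for independent X, X' uniform on xs and Y, Y' uniform on ys."""
    single = sum(bool(event(x, y)) for x in xs for y in ys)
    quad = sum(
        bool(event(x, y) and event(x2, y) and event(x, y2) and event(x2, y2))
        for x in xs
        for x2 in xs
        for y in ys
        for y2 in ys
    )
    size = len(xs) * len(ys)
    return Fraction(single, size), Fraction(quad, size ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        table = [[rng.random() < 0.4 for _ in range(6)] for _ in range(5)]
        p, q = decoupling_sides(lambda x, y: table[x][y], range(5), range(6))
        print(float(p) ** 4, "<=", float(q), p ** 4 <= q)
```

In graph language, `single` counts edges of a bipartite graph and `quad` counts (possibly degenerate) four-cycles, which is the analogy described in Remark 4.8.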

We return to the task of proving Theorem 4.3. Let $Z \in \{0,1\}^n$ be the random
variable $(z_1, \dots, z_n)$. Consider the quadratic form $Q(Z) = Q(z_1, \dots, z_n)$ defined by
(3), and fix a non-trivial partition $\{1, \dots, n\} = U_1 \cup U_2$ and a non-empty subset $S$
of $U_1$. Let $I$ be an interval of length 1. We need to prove that

$$\mathbf{P}(Q(Z) \in I)^4 = O\Big(|S|^{-1/2} + |S|^{-1} \sum_{i \in S} d_i^{-1/2}\Big).$$

Define $X := (z_i)_{i \in U_1}$ and $Y := (z_i)_{i \in U_2}$. We can write $Q(Z) = Q(X,Y)$. Let $z_i'$
be an independent copy of $z_i$ and set $X' := (z_i')_{i \in U_1}$ and $Y' := (z_i')_{i \in U_2}$. Applying
Lemma 4.7, we see that it suffices to show that

$$\mathbf{P}\big(Q(X,Y), Q(X,Y'), Q(X',Y), Q(X',Y') \in I\big) = O\Big(|S|^{-1/2} + |S|^{-1} \sum_{i \in S} d_i^{-1/2}\Big).$$

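A simple calculation shows that the random variable

$$R := Q(X,Y) - Q(X',Y) - Q(X,Y') + Q(X',Y')$$

can be written as

$$R = \sum_{i \in U_1} \sum_{j \in U_2} c_{ij} w_i w_j = \sum_{i \in U_1} R_i w_i$$

where for each index $i$, $w_i$ is the random variable $w_i := z_i - z_i'$, and for $i \in U_1$, $R_i$ is the random
variable

$$R_i := \sum_{j \in U_2} c_{ij} w_j$$

(here $c_{ij}$ is understood as the total coefficient of the cross term $z_i z_j$ in (3)).

We have eliminated the coupling problem in the factorization (4), because the
random variables $(R_i)_{i \in U_1}$ are independent of the random variables $(w_i)_{i \in U_1}$.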

Consider the four events $Q(X,Y) \in I$, $Q(X',Y) \in I$, $Q(X,Y') \in I$ and $Q(X',Y') \in I$.
If all of these hold, then $R$ lies in the interval $J := 2I - 2I$ of length 4. Thus, it
suffices to show that

$$\mathbf{P}(R \in J) = O\Big(|S|^{-1/2} + |S|^{-1} \sum_{i \in S} d_i^{-1/2}\Big).$$

Recall that for each $i \in U_1$, $d_i$ is the number of indices $j \in U_2$ for which
$|c_{ij}| \ge 1$. For each $i \in S \subset U_1$, we may apply Theorem 4.1 (and Remark 4.2) to
the random variable $R_i$ to obtain

$$\mathbf{P}(|R_i| < 1) = O(d_i^{-1/2}). \quad (5)$$

By the union bound we thus have the crude estimate

$$\mathbf{P}(|R_i| \ge 1 \text{ for all } i \in S) = 1 - O\Big(\sum_{i \in S} d_i^{-1/2}\Big).$$

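This use of the union bound is somewhat wasteful and we can do better by invoking the
second moment method. For each $i \in S$, let $I_i$ be the indicator variable of the
event $|R_i| \ge 1$, thus $I_i = 1$ when $|R_i| \ge 1$ and $I_i = 0$ otherwise. Thus (5) can be
rewritten as

$$\mathbf{E}(I_i) = 1 - O(d_i^{-1/2})$$

and hence by linearity of expectation

$$\mathbf{E}\Big(\sum_{i \in S} I_i\Big) = |S| - O\Big(\sum_{i \in S} d_i^{-1/2}\Big).$$

Also, since $d_i \ge 1$, we have at least one $j \in U_2$ for which $|c_{ij}| \ge 1$, which easily
implies that $\mathbf{E}(I_i) \ge 1/2$. Thus we also have

$$\mathbf{E}\Big(\sum_{i \in S} I_i\Big) \ge |S|/2.$$

Next we compute the variance of $\sum_{i \in S} I_i$:

$$\mathrm{Var}\Big(\sum_{i \in S} I_i\Big) = \mathbf{E}\Big(\Big(\sum_{i \in S} I_i\Big)^2\Big) - \mathbf{E}\Big(\sum_{i \in S} I_i\Big)^2$$

$$\le |S|^2 - \Big(|S| - O\Big(\sum_{i \in S} d_i^{-1/2}\Big)\Big)^2$$

$$= O\Big(|S| \sum_{i \in S} d_i^{-1/2}\Big).$$

By Chebyshev's inequality, we conclude

$$\mathbf{P}\Big(\sum_{i \in S} I_i \le |S|/4\Big) \le \frac{\mathrm{Var}\big(\sum_{i \in S} I_i\big)}{(|S|/4)^2} = O\Big(\frac{1}{|S|} \sum_{i \in S} d_i^{-1/2}\Big).$$

Thus with probability $1 - O\big(\frac{1}{|S|} \sum_{i \in S} d_i^{-1/2}\big)$, we have $|R_i| \ge 1$ for at least $|S|/4$
values of $i \in S$.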

Let us now temporarily condition the $R_i$ to be fixed for all $i \in U_1$, and assume that
$|R_i| \ge 1$ for at least $|S|/4$ values of $i \in S$. Applying Theorem 4.1 (and Remark 4.2)
to $R = \sum_{i \in U_1} w_i R_i$ (treating the $R_i$ as fixed coefficients), we have the conditional
probability estimate

$$\mathbf{P}(R \in J \mid R_i \text{ fixed}) = O(|S|^{-1/2}).$$

By the preceding discussion and Bayes' identity, we thus have

$$\mathbf{P}(R \in J) = O(|S|^{-1/2}) + O\Big(\frac{1}{|S|} \sum_{i \in S} d_i^{-1/2}\Big)$$

as desired. ■

5. Proof of Lemmas 2.4 and 2.8 

Proof of Lemma 2.4. Let $A$ be a normal symmetric singular $n \times n$ matrix of rank
$d$. Let $v_i$ be the $i$th row vector of $A$. Without loss of generality, we can assume that
$v_1, \dots, v_d$ are linearly independent. Thus, the last row vector $v_n$ can be written as
a linear combination of these vectors in a unique way:

$$v_n = \sum_{i=1}^{d} c_i v_i.$$

As $A$ is normal, by definition at least $N$ among the coefficients $c_i$ are non-zero.

Consider the addition of a random (0,1) column of length $n$ to $A$. Each of the
row vectors receives a new (random) coordinate and becomes a new vector $v_i'$.
Clearly, $v_1', \dots, v_d'$ are still independent. If the new matrix $A'$ fails to have a larger
rank than $A$, then the last row $v_n'$ must remain within the span of $v_1', \dots, v_d'$. By
considering the first $n$ coordinates, the only way this can happen is if

$$v_n' = \sum_{i=1}^{d} c_i v_i'.$$

This implies that the last coordinate $x_{n+1}$ of $v_n'$ satisfies

$$x_{n+1} = \sum_{i=1}^{d} c_i y^i_{n+1}, \quad (6)$$

where $y^i_{n+1}$ is the last coordinate of $v_i'$. Since $x_{n+1}$ and the $y^i_{n+1}$ are i.i.d. (0,1) random
variables and at least $N$ of the $c_i$ are non-zero, Theorem 4.1 (see also Remark 4.2)
implies that the probability that (6) holds is $O(N^{-1/2})$. Thus, we can conclude
that with probability $1 - O(N^{-1/2})$, the new column increases the rank by one.
If adding the new column increases the rank by one, then by the fact that $A$ is
symmetric, adding the column and its transpose as a new row increases the rank
of $A$ by 2 (regardless of the value of the last diagonal entry), concluding the proof. □

Proof of Lemma 2.8. Let $A$ be a perfect non-singular symmetric matrix of order
$n$. Let $A'$ be the $n+1$ by $n+1$ symmetric matrix obtained from $A$ by adding a
new random (0,1) column $u$ of length $n+1$ as the $(n+1)$st column and its transpose
as the $(n+1)$st row.

Let $x_1, \dots, x_{n+1}$ be the coordinates of $u$; $x_{n+1}$ is the lower-right diagonal entry of
$A'$. The determinant of $A'$ can be expressed as

$$(\det A) x_{n+1} + \sum_{i,j=1}^{n} c_{ij} x_i x_j$$

where $c_{ij}$ is the $ij$ cofactor of $A$. We can rewrite $\det A'$ as

$$Q(x_1, \dots, x_{n+1}) = (\det A) x_{n+1}^2 + \sum_{i,j=1}^{n} c_{ij} x_i x_j$$

thanks to the fact that $x_{n+1}^2 = x_{n+1}$. We are going to bound the probability that
$Q = 0$.

In order to apply Corollary 4.5, we next show that for each $1 \le i \le n$, many among
the $c_{ij}$ are not zero.

Since $A$ is non-singular, dropping the $i$th row (for any $1 \le i \le n$) results in an
$(n-1) \times n$ matrix whose columns admit a unique (up to scaling) vanishing linear
combination $\sum_{j=1}^{n} a_j u_j$. As $A$ is perfect, at least $N$ among the coefficients $a_j$ are
non-zero. For each $j$ where $a_j \ne 0$, dropping both the $i$th row and the $j$th column
must result in a full rank matrix of order $n-1$. Thus $c_{ij} \ne 0$. Thus, we can
conclude that for each $1 \le i \le n$, there are at least $N$ indices $j$ where $c_{ij} \ne 0$. The
claim of the lemma follows by applying Corollary 4.5 with $m = N$. □



6. More general results 



In this section we briefly discuss (without detailed proofs) several easy extensions 
of the method to yield some variants and generalizations of our results. 



6.1. Generalizations of Theorem 4.3. Theorem 4.3 and Corollary 4.4 can be 
extended to polynomials with arbitrary degree. One such extension reads as follows: 

Theorem 6.2. Let $z_1, z_2, \dots, z_n$ be i.i.d. random variables which take values 0 and
1 with probability 1/2. Let $k$ be a fixed positive integer. Let

$$f := \sum_{1 \le i_1, \dots, i_k \le n} a_{i_1, i_2, \dots, i_k} z_{i_1} z_{i_2} \cdots z_{i_k}$$

where at least $n^{k-1} m$ of the coefficients $a_{i_1, i_2, \dots, i_k}$ are at least 1 in absolute value.
Then for any interval $I$ of length 1

$$\mathbf{P}(f \in I) = O\big(m^{-1/2^{(k^2+k)/2}}\big),$$

where the implicit constant in $O$ depends on $k$.



The proof proceeds via induction on $k$, with the base case being the classical
Littlewood-Offord lemma and the inductive step closely following that of
Theorem 4.3, including the use of the following generalization of the decoupling lemma
(also proven by induction on $k$):

Lemma 6.3 (Decoupling lemma). Let $X_1, \dots, X_k$ be random variables and
$E = E(X_1, \dots, X_k)$ be an event depending on the $X_i$. Then

$$\mathbf{P}(E(X_1, \dots, X_k)) \le \mathbf{P}\Big(\bigwedge_{S \subset \{1, \dots, k\}} E(X_1^S, \dots, X_k^S)\Big)^{1/2^k}$$






KEVIN COSTELLO, TERENCE TAO, AND VAN VU 



where $X_i^S := X_i$ if $i \in S$ and $X_i^S := X_i'$, an independent copy of $X_i$, if $i \notin S$.

Theorem 4.3 can also be extended to more general classes of variables than the
Bernoulli random variable (taking values 0 and 1 with equal probability) by a
nearly identical proof, with the main difference being that the base case, Theorem 
4.1, must be replaced by [3, Theorem 4]. 

6.4. Generalizations of Theorem 1.3. We say that a random variable $\xi$ has the
$p$-property if

$$\max_{c \in \mathbb{R}} \mathbf{P}(\xi = c) \le p.$$

Let $\xi_{ij}$, $1 \le i \le j \le n$, be independent random variables. Assume that there is a
constant $p < 1$ (not depending on $n$) such that for all $1 \le i < j \le n$, $\xi_{ij}$ has the
$p$-property. Observe that we do not require the $\xi_{ij}$ to be identically distributed, and that furthermore
we do not place any requirements on the diagonal elements of the matrix.

Theorem 6.5. Let $\xi_{ij}$, $1 \le i \le j \le n$, be as above. Let $Q_n$ be the random symmetric
matrix with upper diagonal entries $\xi_{ij}$. Then $Q_n$ is non-singular with probability
$1 - O(n^{-1/8+\delta})$, where the implicit constant depends only on $p$ and $\delta$.

To prove this result, it suffices to show that analogues of Lemmas 2.3-2.8 still hold
for this more general model. Lemmas 2.3 and 2.7 (with 2/3 replaced by any $\delta$
with $p < \delta < 1$) follow from the same argument as in the original theorem, except
that Lemma 3.1 must be replaced by

Lemma 6.6. Let $H$ be a linear subspace in $\mathbb{R}^n$ of dimension at most $d \le n$. Let
$v$ be a vector whose entries are independent random variables, all but one of which
have the $p$-property. Then

$$\mathbf{P}(v \in H) \le p^{n-d-1}.$$

Proof. As before, $H$ can be parameterized by some set of $d' \le d$ coordinates. Once
those coordinates of $v$ are known, the remaining coordinates can each take on at
most one value for the event $v \in H$ to hold, giving a necessary set of $n - d'$ independent events,
$n - d' - 1$ of which have probability at most $p$. ■

The proof of Lemma 2.4 also goes through, except that Theorem 4.1 must be 
replaced by the following rescaled version of the d = 1 case of [3, Theorem 4]: 

Lemma 6.7. Let $z_1, \dots, z_n$ be independent random variables with the $p$-property.
Let $a_1, \dots, a_n$ be real deterministic coefficients, with $a_i \ne 0$ for at least $k$ values of
$i$. Then for any $c \in \mathbb{R}$, we have

$$\mathbf{P}\Big(\sum_{i=1}^{n} a_i z_i = c\Big) = O((1+k)^{-1/2})$$

where the implied constant depends only on $p$.

A nearly identical decoupling argument now proves an analogue of Theorem 4.3, 
with $d_i$ now taken for each $i$ to be the number of $j$ for which $c_{ij}$ is nonzero. Corollary
4.5 (with the implied constant now depending only on p) and Lemma 2.8 now follow 
as before. 



7. Open questions 

Let us conclude this section with a few open questions. From a quantitative point 
of view, there are two natural ways to strengthen both Questions 1.1 and 1.2. 

Question 7.1. Give an estimate for the determinant. 

Question 7.2. Give an estimate for the probability that the matrix is singular. 

In fact, Question 7.1 seems to be the motivation of Komlós for his original paper
[5] (see the title of that paper) which started this line of research. There are several
partial results concerning the model $A_n$. In the rest of this section, it is more
convenient to assume that the entries of $A_n$ (and $Q_n$) take values 1 and $-1$ (rather
than 1 and 0). Under this condition, Tao and Vu [7] showed that almost surely
$\det A_n$ has absolute value $n^{(1/2 - o(1))n}$. We conjecture that a similar bound holds
for $Q_n$.

Conjecture 7.3. Almost surely, $|\det Q_n| = n^{(1/2 - o(1))n}$.

Regarding Question 7.2, Kahn, Komlós and Szemerédi [4] proved that the singularity
probability of $A_n$ is $O(.999^n)$. This bound has recently been improved [8] to
$(3/4 + o(1))^n$. The conjectured bound is $(1/2 + o(1))^n$. We conjecture that the same bound
holds for $Q_n$.

Conjecture 7.4. The probability that $Q_n$ is singular is $(1/2 + o(1))^n$.

By considering the probability that the first two rows are equal, it is easy to see that
$(1/2 + o(1))^n$ is a lower bound (one can actually make a more precise conjecture
similar to the case with $A_n$). The proof in this paper shows an upper bound $O(n^{-c})$
for some positive constant $c$.

The main obstacle in these questions is the fact that the row vectors of $Q_n$, unlike
those of $A_n$, are not independent. In fact, if one exposes these vectors one by one,
then the last few vectors are almost deterministic. The independence among the
row vectors is critical in all previous papers [4, 7, 8]. It thus seems to require a new
idea to attack these conjectures.

Acknowledgement. We would like to thank G. Kalai for communicating the problem. 